Strengths and Weaknesses of Quantum Fingerprinting

Abstract

We study the power of quantum fingerprints in the simultaneous message passing (SMP) setting of communication complexity. Yao recently showed how to simulate, with exponential overhead, classical shared-randomness SMP protocols by means of quantum SMP protocols without shared randomness ($Q^{\parallel}$-protocols). Our first result is to extend Yao's simulation to the strongest possible model: every many-round quantum protocol with unlimited shared entanglement can be simulated, with exponential overhead, by $Q^{\parallel}$-protocols. We apply our technique to obtain an efficient $Q^{\parallel}$-protocol for a function which cannot be efficiently solved through more restricted simulations. Second, we tightly characterize the power of the quantum fingerprinting technique by making a connection to arrangements of homogeneous halfspaces with maximal margin. These arrangements have been well studied in computational learning theory, and we use some strong results obtained in this area to exhibit weaknesses of quantum fingerprinting. In particular, this implies that for almost all functions, quantum fingerprinting protocols are exponentially worse than classical deterministic SMP protocols.

1 Introduction

1.1 Setting



This paper studies the power of quantum fingerprinting protocols in communication complexity. In the simultaneous message passing (SMP) setting, Alice and Bob hold inputs $x$ and $y$, respectively, and each send a message to a third party, usually called the "referee". The referee holds no input himself, but is supposed to infer some function $f(x,y)$ from the messages he receives. The goal is to minimize the amount of communication sent for the worst-case input $x, y$. In this model there is no direct communication between Alice and Bob themselves, unlike in the standard model of one-way or multi-round two-party communication complexity. The SMP model is arguably the weakest setting of communication complexity that is still interesting.

We will consider SMP quantum protocols where Alice sends a $q$-qubit state $|\alpha_x\rangle$, Bob sends a $q$-qubit state $|\beta_y\rangle$, and the referee does the 2-outcome "swap test" [BCWW01]. This test outputs 0 with probability

$$\frac{1}{2} + \frac{|\langle\alpha_x|\beta_y\rangle|^2}{2}.$$

Estimating this probability is tantamount to estimating the absolute value of the inner product $\langle\alpha_x|\beta_y\rangle$. They repeat this $r$ times in parallel, the referee uses the $r$ bits that are the outcomes of his $r$ swap tests to estimate $|\langle\alpha_x|\beta_y\rangle|$, and bases his output on this estimate. We will call such protocols "repeated fingerprinting protocols".

A quantum protocol of this form can only work efficiently if we can ensure that $|\langle\alpha_x|\beta_y\rangle|^2 \le \delta_0$ whenever $f(x,y) = 0$ and $|\langle\alpha_x|\beta_y\rangle|^2 \ge \delta_1$ whenever $f(x,y) = 1$. Here $\delta_0 < \delta_1$ should be reasonably far apart, otherwise $r$ would have to be too large to distinguish the two cases with high probability. A statistical argument shows that $r = \Theta(1/(\delta_1 - \delta_0)^2)$ is necessary and sufficient for this. In total, such a protocol uses $2qr = O(q/(\delta_1 - \delta_0)^2)$ qubits of communication. Generally a protocol is considered "efficient" if its communication cost is polylogarithmic in the input length. Even though quantum fingerprinting is a restricted model, it is the only technique we know to get interesting quantum protocols in the SMP model.
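The estimation step can be illustrated with a small classical simulation. This is only a sketch: it assumes the standard swap-test statistics (outcome 0 with probability $1/2 + |\langle\alpha_x|\beta_y\rangle|^2/2$), real-valued fingerprints, and an arbitrary constant in the repetition count; the function names are hypothetical.

```python
import math
import random


def swap_test_zero_probability(alpha, beta):
    """Probability that the swap test outputs 0 for real unit vectors."""
    overlap = sum(a * b for a, b in zip(alpha, beta))
    return 0.5 + 0.5 * overlap ** 2


def estimate_squared_overlap(alpha, beta, repetitions, rng):
    """Estimate |<alpha|beta>|^2 from simulated swap-test outcomes."""
    p_zero = swap_test_zero_probability(alpha, beta)
    zeros = sum(1 for _ in range(repetitions) if rng.random() < p_zero)
    # Invert p_zero = 1/2 + overlap^2 / 2.
    return 2.0 * zeros / repetitions - 1.0


def repetitions_for_gap(delta0, delta1, constant=10.0):
    """Illustrative r = constant / (delta1 - delta0)^2; the constant is assumed."""
    return math.ceil(constant / (delta1 - delta0) ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    r = repetitions_for_gap(0.0, 0.5)
    print(r, estimate_squared_overlap([1.0, 0.0], [0.6, 0.8], r, rng))
```

The referee would compare the estimate against a threshold between $\delta_0$ and $\delta_1$; the quadratic dependence of `repetitions_for_gap` on the gap is what drives the communication cost.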

A bit of notation before we get into the study of quantum fingerprinting: we use $R^{\parallel}(f)$ to denote the minimal cost among all classical SMP protocols that compute $f$ with error probability at most 1/3 on all inputs. Replacing superscript '$\parallel$' by '1', or removing this superscript altogether, gives respectively one-way and multi-round communication complexity in the standard two-party model without the referee. Adding superscripts 'pub' or 'ent' indicates that Alice and Bob share unlimited amounts of shared randomness or shared entanglement. These shared resources do not count towards the communication cost. Replacing '$R$' by '$Q$' gives the variants of these measures where the communication consists of qubits instead of classical bits.

1.2 Strengths of quantum fingerprinting 

Quantum fingerprints have surprising power. They were first used by Buhrman et al. [BCWW01] to show $Q^{\parallel}(\mathrm{EQ}) = O(\log n)$ for the $n$-bit equality function. In contrast, it is known that $R^{\parallel}(\mathrm{EQ}) = \Theta(\sqrt{n})$ [Amb96, NS96, BK97], while $R^{\parallel,\mathrm{pub}}(\mathrm{EQ}) = O(1)$. Subsequently, Yao [Yao03] showed that

$$Q^{\parallel}(f) = 2^{O(R^{\parallel,\mathrm{pub}}(f))} \log n.$$

In particular, if $R^{\parallel,\mathrm{pub}}(f) = O(1)$ then $Q^{\parallel}(f) = O(\log n)$. The quantum fingerprinting protocol for equality is a special case of this result. Yao's exponential simulation can be extended to relational problems, and recently Gavinsky et al. [GKRW06] showed that it is essentially optimal by exhibiting a relational problem $P_1$ for which $R^{\parallel,\mathrm{pub}}(P_1) = O(\log n)$ and $Q^{\parallel}(P_1) = \Omega(n^{1/3})$. Whether there exist exponential gaps for functional problems remains open.

In this paper we show that Yao's simulation can be extended far beyond classical SMP protocols. Given any bounded-error two-party quantum protocol with $q$ qubits of communication, no matter how many rounds of communication, and no matter how much entanglement it starts with, we show how to construct a repeated quantum fingerprinting protocol that communicates $2^{O(q)} \log n$ qubits and computes the same function with small error probability. In symbols:

$$Q^{\parallel}(f) = 2^{O(Q^{\mathrm{ent}}(f))} \log n.$$

Thus, the exponential simulation still works even if we add interaction, quantum communication, and entanglement to the $R^{\parallel,\mathrm{pub}}$-model that Yao considered. When we restrict to simulating $R^{\parallel,\mathrm{pub}}$-protocols, we get a bound that is quadratically better than Yao's. A similar quadratic improvement over Yao's has been obtained independently by Golinsky and Sen [GS03].
Actually, the vectors that we construct for our quantum simulation can also be used to obtain a classical SMP protocol with shared randomness and $O(r)$ bits of communication ($r$ being the number of repetitions of the quantum protocol), as follows. Alice and Bob use their shared randomness to pick an $O(1)$-dimensional random subspace and each projects her/his vector onto that space and renormalizes. The expectation of the inner product of the two projected vectors equals their original inner product. They send the resulting $O(1)$-dimensional vectors to the referee in sufficient precision ($O(\log r)$ bits per entry suffices), and repeat this $O(r)$ times to approximate the inner product between the original vectors with sufficient precision. Hence our construction implies Shi's result $R^{\parallel,\mathrm{pub}}(f) \le 2^{O(Q^{\mathrm{ent}}(f))}$ [Shi05, Theorem 1.2].

This is not too surprising, because our derivation of the appropriate vectors (fingerprints) from the $Q^{\mathrm{ent}}$-protocol is inspired by some of the techniques in Shi's paper, though we avoid his use of tensor norms.

The fact that our simulation has exponential overhead is unfortunate but unavoidable. For instance, for Raz's function [Raz99] we have $Q(f) = O(\log n)$ via a two-round protocol while it is easy to see that any quantum fingerprinting protocol needs to communicate $n^{\Omega(1)}$ qubits: by the argument of the last paragraph, a quantum fingerprinting protocol implies a classical shared-randomness protocol of roughly the same complexity, and Raz proved that all classical protocols for his problem require $n^{\Omega(1)}$ bits of communication. Despite the exponential overhead, our simulation still gives nontrivial efficient $Q^{\parallel}$-protocols when simulating protocols with $O(\log\log n)$ quantum communication and much shared randomness or entanglement. We give an example in Section 2.3.

1.3 Characterization and weaknesses of quantum fingerprinting 

The results above show some of the strengths of quantum fingerprinting protocols. What about its weaknesses? For instance, is it possible that quantum SMP protocols based on repeated fingerprinting are equal in power to arbitrary quantum SMP protocols? In Section 3 we show that for most functions they are much weaker.

Our main tool is a tight characterization of quantum fingerprinting systems in terms of the optimal margin achievable by realizations of the computational problem via an arrangement of homogeneous halfspaces (Theorem 6). The latter mouthful has been well studied in machine learning, and forms the basis of maximal-margin classifiers and support vector machines. This connection between quantum fingerprints and these embeddings is straightforward, but allows us to tap into some of the strong theorems known about such margins, particularly a result of Forster [For01] and its recent strengthening by Linial et al. [LMSS05]. The upshot is that repeated quantum fingerprinting protocols are exponentially worse than general quantum and even classical SMP protocols for almost all functions.

This three-way connection between quantum communication complexity, margin complexity, and learning theory allows us to make other connections as well. For example, good learning protocols give good lower bounds on margins, which give new upper bounds for repeated fingerprinting protocols. In the other direction, an efficient multi-round quantum protocol for some Boolean function implies lower bounds on the margin of the corresponding matrix. We give an example of this in Section 3.2. Finally, since our positive result above relates quantum fingerprinting to general $Q^{\mathrm{ent}}$-complexity, we can also use known results about margin complexity to obtain some new lower bounds on $Q^{\mathrm{ent}}(f)$. We explore the latter direction in Section 3.3. There we show $Q^{\mathrm{ent}}(f) = \Omega(\log(1/\gamma(f)))$, where $\gamma(f)$ is the "maximal margin" among all embeddings of $f$. This bound was independently obtained by Linial and Shraibman [LS06] in a recent manuscript, which also shows the beautiful new result that margin complexity and discrepancy are linearly related.

2 Simulating Arbitrary Quantum Protocols 

In this section we show how to extend Yao's simulation from classical SMP protocols with shared randomness to multi-round quantum protocols with shared randomness (Section 2.1), and then even to arbitrary multi-round quantum protocols with shared entanglement (Section 2.2).

2.1 Simulating shared-randomness multi-round quantum protocols 

Let $f : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$ be a communication complexity problem. Our construction also works for promise functions, but for simplicity we describe it here for a total function. Let $P$ be the $2^n \times 2^n$ matrix of acceptance probabilities of a bounded-error quantum protocol for $f$. We first assume the protocol communicates $q$ qubits and doesn't use prior shared entanglement or shared randomness. It is known [Yao93, Kre95] that we can decompose $P = AB^\dagger$ where $A, B$ are $2^n \times 2^{2q-2}$ matrices, each of whose entries has absolute value at most 1, and $B^\dagger$ is the conjugate transpose of $B$. Let $a(x)$ be the $x$-th row of $A$ and $b(y)$ be the $y$-th row of $B$. Then for all $x, y$ we have

$$P(x,y) = \langle a(x)|b(y)\rangle \quad\text{and}\quad \|a(x)\|, \|b(y)\| \le 2^{q-1}.$$

Now consider a quantum protocol that uses shared randomness. By Newman's theorem [New91], we can assume without loss of generality that the shared random string $r$ is picked uniformly from a set $R$ of $O(n)$ elements. Then we can decompose

$$P = \frac{1}{|R|} \sum_{r \in R} P_r,$$

where $P_r$ is the matrix of probabilities if we run the protocol with shared string $r$. Each $P_r$ induces vectors $a_r(x), b_r(y)$ as above, and we have

$$f(x,y) \approx P(x,y) = \frac{1}{|R|} \sum_{r \in R} \langle a_r(x)|b_r(y)\rangle,$$

where '$\approx$' means that $f(x,y)$ and $P(x,y)$ differ by at most the error probability of the protocol. Define pure $(q + \log n + O(1))$-qubit states as follows

$$|\alpha_x\rangle = \frac{1}{\sqrt{|R|}} \sum_{r \in R} |r\rangle \, \frac{|a_r(x)\rangle + \sqrt{2^{2q-2} - \|a_r(x)\|^2}\,|\mathrm{junk}_1\rangle}{2^{q-1}}$$

and

$$|\beta_y\rangle = \frac{1}{\sqrt{|R|}} \sum_{r \in R} |r\rangle \, \frac{|b_r(y)\rangle + \sqrt{2^{2q-2} - \|b_r(y)\|^2}\,|\mathrm{junk}_2\rangle}{2^{q-1}},$$

where '$\mathrm{junk}_1$' and '$\mathrm{junk}_2$' are distinct special basis states. Note that

$$\langle\alpha_x|\beta_y\rangle = \frac{1}{|R|} \sum_{r \in R} \frac{\langle a_r(x)|b_r(y)\rangle}{2^{2q-2}} = \frac{P(x,y)}{2^{2q-2}}.$$

Using these states gives a repeated quantum fingerprinting protocol that computes $f$ with small error and sends $O(2^{8q} \log n)$ qubits of communication, without shared randomness.
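The padding with orthogonal "junk" coordinates can be sketched numerically. The sketch below assumes real vectors, a generic norm bound `bound` (playing the role of the bound on $\|a_r(x)\|$ and $\|b_r(y)\|$), and hypothetical function names; it checks that the padded states are unit vectors whose inner product is the average of $\langle a_r(x)|b_r(y)\rangle$ divided by the squared bound.

```python
import math


def fingerprint(vec, bound, junk_slot):
    """Pad `vec` (norm <= bound) to a unit vector using one of two junk slots.

    junk_slot is 0 for Alice and 1 for Bob, so the junk amplitudes of the
    two parties sit on distinct basis states and never overlap.
    """
    norm_sq = sum(v * v for v in vec)
    if norm_sq > bound * bound + 1e-12:
        raise ValueError("vector norm exceeds the bound")
    junk = [0.0, 0.0]
    junk[junk_slot] = math.sqrt(max(bound * bound - norm_sq, 0.0))
    return [v / bound for v in list(vec) + junk]


def fingerprint_shared(vecs_by_r, bound, junk_slot):
    """Uniform superposition over r of |r> tensor the padded vector for r."""
    scale = 1.0 / math.sqrt(len(vecs_by_r))
    state = []
    for vec in vecs_by_r:
        state.extend(scale * x for x in fingerprint(vec, bound, junk_slot))
    return state


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    a_r = [[1.0, 0.0], [0.0, 1.0]]   # toy vectors a_r(x) for two strings r
    b_r = [[1.0, 1.0], [0.0, 1.0]]   # toy vectors b_r(y)
    alpha = fingerprint_shared(a_r, 2.0, 0)
    beta = fingerprint_shared(b_r, 2.0, 1)
    print(inner(alpha, alpha), inner(alpha, beta))
```

Because the inner product shrinks by the squared norm bound, the gap between accepting and rejecting inputs shrinks as well, which is where the exponential number of repetitions comes from.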

Theorem 1 $Q^{\parallel}(f) = O(2^{8Q^{\mathrm{pub}}(f)} \log n)$.

Note that we put $\log n$ instead of $q + \log n$ for the last factor. That is clearly correct if $q \le (\log n)/8$; and if $q > (\log n)/8$ then the right-hand side is more than $n$, which is a trivially true upper bound on $Q^{\parallel}(f)$.

We can get a better exponent in the case of classical one-way protocols. Suppose Alice's classical message is $c = R^{1,\mathrm{pub}}(f)$ bits. Let $a_r(x) \in \{0,1\}^{2^c}$ have a 1 only in the coordinate corresponding to the message Alice sends given input $x$ and random string $r$. Let $b_r(y) \in \{0,1\}^{2^c}$ be 1 on the messages of Alice that lead Bob to output 1 (given $y$ and $r$). Then $P_r(x,y) = \langle a_r(x)|b_r(y)\rangle$, $\|a_r(x)\| = 1$ and $\|b_r(y)\| \le 2^{c/2}$. The above fingerprinting construction now gives a protocol with $O(2^{2c} \log n)$ qubits.

Theorem 2 $Q^{\parallel}(f) = O(2^{2R^{1,\mathrm{pub}}(f)} \log n)$.
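For a fixed random string, the vectors used for one-way protocols are easy to write down explicitly. The following sketch uses hypothetical helper names and a toy protocol (Alice sends her 2-bit input, Bob accepts iff it equals his); it only illustrates that the inner product of the one-hot vector and the indicator vector equals Bob's output.

```python
def alice_vector(message_index, c):
    """One-hot vector in {0,1}^(2^c) marking the c-bit message Alice sends."""
    vec = [0] * (1 << c)
    vec[message_index] = 1
    return vec


def bob_vector(accepts, c):
    """Indicator vector of the messages on which Bob outputs 1.

    `accepts(m)` is a hypothetical predicate describing Bob's decision on
    message m for his fixed input y and random string r.
    """
    return [1 if accepts(m) else 0 for m in range(1 << c)]


def acceptance(a_vec, b_vec):
    """Inner product <a|b>, which is 1 exactly when Bob accepts."""
    return sum(a * b for a, b in zip(a_vec, b_vec))


if __name__ == "__main__":
    c = 2
    x, y = 2, 2
    print(acceptance(alice_vector(x, c), bob_vector(lambda m: m == y, c)))
```

The squared norm of Bob's vector is the number of accepted messages, so it is at most $2^c$, matching the bound $\|b_r(y)\| \le 2^{c/2}$.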

Analogously we can simulate classical shared-randomness SMP protocols. Suppose Alice's messages are $c \le R^{\parallel,\mathrm{pub}}(f)/2$ bits long. This gives rise to a repeated fingerprinting protocol with $O(2^{2c} \log n)$ qubits of communication: define $a_r(x)$ as before and let $b_r(y) \in \{0,1\}^{2^c}$ be 1 on the possible messages $a$ of Alice that would lead the referee to accept given $a$ and the message Bob would send (on his input $y$ and random string $r$). This bound is quadratically better than Yao's simulation of classical SMP protocols.

Theorem 3 $Q^{\parallel}(f) = O(2^{R^{\parallel,\mathrm{pub}}(f)} \log n)$.

2.2 Simulating shared-entanglement multi-round quantum protocols 

Now consider the case where our multi-round quantum protocol uses $q$ qubits of communication and some entangled starting state. Our proof for this most general case is inspired by Shi's result $R^{\parallel,\mathrm{pub}}(f) \le 2^{O(Q^{\mathrm{ent}}(f))}$ [Shi05, Theorem 1.2]. The following lemma is due to Razborov [Raz03, Proposition 3.3] and is similar to earlier statements in [Yao93, Kre95]. It can be proved by induction on $q$.

Lemma 1 (Kremer-Razborov-Yao) Let $|\Psi\rangle$ denote the (possibly entangled) starting state of the protocol. For all inputs $x$ and $y$, there exist linear operators $A_h(x), B_h(y)$, $h \in \{0,1\}^{q-1}$, each with operator norm $\le 1$, such that the acceptance probability of the protocol is

$$P(x,y) = \Big\| \sum_{h \in \{0,1\}^{q-1}} (A_h(x) \otimes B_h(y)) |\Psi\rangle \Big\|^2.$$
We will derive vectors $a(x)$ and $b(y)$ from this characterization. Assume without loss of generality that the prior entanglement is

$$|\Psi\rangle = \sum_{e \in E} \lambda_e |e\rangle|e\rangle,$$

with $\{|e\rangle\}$ an orthonormal set of states and $\sum_e \lambda_e^2 = 1$. Note that $|E|$ may be huge. Now we can write

$$P(x,y) = \Big\| \sum_{h \in \{0,1\}^{q-1}} (A_h(x) \otimes B_h(y)) |\Psi\rangle \Big\|^2 = \sum_{h,h',e,e'} \lambda_{e'} \langle e|A_h(x)^\dagger A_{h'}(x)|e'\rangle \cdot \lambda_e \langle e|B_h(y)^\dagger B_{h'}(y)|e'\rangle.$$

Define $a(x)$ to be the $|E|^2 2^{2q-2}$-dimensional vector with complex entries $\lambda_{e'} \langle e|A_h(x)^\dagger A_{h'}(x)|e'\rangle$, indexed by tuples $(h,h',e,e')$, and similarly define $b(y)$ with entries $\lambda_e \langle e|B_h(y)^\dagger B_{h'}(y)|e'\rangle$. Then

$$P(x,y) = \langle a(x)|b(y)\rangle.$$

Using that the set of $|e\rangle$-states is an orthonormal set in the space in which $A_h(x)^\dagger A_{h'}(x)|e'\rangle$ lives, and the fact that $\|A_h(x)^\dagger A_{h'}(x)\| \le \|A_h(x)\| \cdot \|A_{h'}(x)\| \le 1$, we have

$$\|a(x)\|^2 = \sum_{h,h',e,e'} \lambda_{e'}^2 |\langle e|A_h(x)^\dagger A_{h'}(x)|e'\rangle|^2 \le \sum_{h,h',e'} \lambda_{e'}^2 \|A_h(x)^\dagger A_{h'}(x)|e'\rangle\|^2 \le \sum_{h,h',e'} \lambda_{e'}^2 = 2^{2q-2}.$$

Similarly $\|b(y)\| \le 2^{q-1}$.

The norms and inner products of the $a(x)$ and $b(y)$ vectors are thus as before. It remains to reduce their dimension $D = |E|^2 2^{2q-2}$, which may be very large. For this we use the Johnson-Lindenstrauss lemma (proved in [JL84], see e.g. [DG99] for a simple proof).

Lemma 2 (Johnson & Lindenstrauss) Let $\varepsilon > 0$ and $d \ge 4\ln(N)/(\varepsilon^2/2 - \varepsilon^3/3)$. For every set $V$ of $N$ points in $\mathbb{R}^D$ there exists a map $\rho : \mathbb{R}^D \to \mathbb{R}^d$ such that for all $u, v \in V$

$$(1 - \varepsilon)\|u - v\|^2 \le \|\rho(u) - \rho(v)\|^2 \le (1 + \varepsilon)\|u - v\|^2.$$

To get the above map $\rho$, it actually suffices to project the vectors onto a random $d$-dimensional subspace and rescale by a factor of $\sqrt{D/d}$. With high probability, this approximately preserves all distances. Note that if the set $V$ includes the 0-vector, then also the norms of all $v \in V$ will be approximately preserved. Since

$$\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\langle u|v\rangle,$$

the map $\rho$ also approximately preserves the inner products between all pairs of vectors in $V$ if $\varepsilon$ is sufficiently small.
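A quick numerical illustration of this dimension reduction is given below. Note the swapped-in technique: instead of projecting onto a random subspace and rescaling, the sketch multiplies by a Gaussian matrix scaled by $1/\sqrt{d}$, a common variant with the same flavour of guarantee. The dimensions and names are illustrative assumptions.

```python
import math
import random


def random_projection(dim_in, dim_out, rng):
    """Gaussian dim_out x dim_in matrix with entries N(0, 1/dim_out)."""
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]


def project(matrix, vec):
    """Matrix-vector product, i.e. the image of `vec` under the random map."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    rng = random.Random(0)
    D, d = 600, 300
    rho = random_projection(D, d, rng)
    u = [1.0] + [0.0] * (D - 1)
    v = [0.6, 0.8] + [0.0] * (D - 2)
    pu, pv = project(rho, u), project(rho, v)
    # Norms and the inner product <u|v> = 0.6 are preserved up to small error.
    print(inner(pu, pu), inner(pv, pv), inner(pu, pv))
```

The fluctuations of the projected inner product scale like $1/\sqrt{d}$, which is why a target accuracy $\varepsilon$ forces $d$ to grow like $1/\varepsilon^2$.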

We assume for simplicity that our vectors $a(x)$ and $b(y)$ are real. Let our set $V$ contain all $a(x)$ and $b(y)$ as well as the 0-vector (so $N = 2 \cdot 2^n + 1$). Applying the Johnson-Lindenstrauss lemma with $\varepsilon = 1/(10 \cdot 2^{2q})$ and $d = O(\log(N)/\varepsilon^2) = O(n 2^{4q})$ gives us $d$-dimensional vectors $\rho(a(x))$ and $\rho(b(y))$ of norm at most $2^q$ such that

$$|\langle\rho(a(x))|\rho(b(y))\rangle - \langle a(x)|b(y)\rangle| \le 1/10.$$

We fix these vectors once and for all before the protocol starts; note that we are not using shared randomness in the protocol itself.^1

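Now define quantum states in $d + 2$ dimensions by

$$|\alpha_x\rangle = \frac{|\rho(a(x))\rangle + \sqrt{2^{2q} - \|\rho(a(x))\|^2}\,|\mathrm{junk}_1\rangle}{2^q}$$

and

$$|\beta_y\rangle = \frac{|\rho(b(y))\rangle + \sqrt{2^{2q} - \|\rho(b(y))\|^2}\,|\mathrm{junk}_2\rangle}{2^q}.$$

Note that

$$\langle\alpha_x|\beta_y\rangle = \frac{\langle\rho(a(x))|\rho(b(y))\rangle}{2^{2q}} = \frac{\langle a(x)|b(y)\rangle}{2^{2q}} \pm \frac{1}{10 \cdot 2^{2q}}.$$

Hence these states form a repeated fingerprinting protocol with fingerprints of $\log(d + 2) = O(q + \log n)$ qubits and $O(2^{8q})$ repetitions.

Theorem 4 $Q^{\parallel}(f) = O(2^{8Q^{\mathrm{ent}}(f)} \log n)$.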



2.3 An example problem 

Here we apply Theorem 1 to obtain an efficient SMP protocol for a particular problem; we do not know how to obtain an efficient protocol for this problem without using Theorem 1. More precisely, we give an example of a Boolean function $f$ for which there exists a 4-round quantum protocol that uses $q = O(\log\log n)$ qubits of communication and $O(\log n)$ bits of shared randomness. Our simulation implies the existence of an efficient quantum SMP protocol for $f$:

$$Q^{\parallel}(f) \le 2^{O(\log\log n)} \log n = (\log n)^{O(1)}.$$

The problem uses many small copies of Raz's 2-round communication problem from [Raz99], and is defined as follows.

Alice's input: string $x \in \{0,1\}^k$, unit vectors $v_1, \ldots, v_k \in \mathbb{R}^m$, and $m/2$-dimensional subspaces $S_1, \ldots, S_k$ of $\mathbb{R}^m$

Bob's input: string $y \in \{0,1\}^k$, and $m$-dimensional unitaries $U_1, \ldots, U_k$

Promise: $|x \oplus y| = k/\log\log k$, and either

($f = 0$) $U_i v_i \in S_i$ for each $i$ where $x_i \oplus y_i = 1$, or

($f = 1$) $U_i v_i \in S_i^{\perp}$ for each $i$ where $x_i \oplus y_i = 1$

As stated this is a problem with continuous input, but we can easily approximate the entries of the vectors, unitaries, and subspaces by $O(\log m)$-bit numbers. Thus the input length is $n = O(km^2 \log m)$ and we choose $m = \log k$.

Here's a simple 4-round protocol for this problem. First, Alice and Bob use shared randomness to pick $O(\log\log k)$ indices $i \in [k]$. Alice sends the corresponding $x_i$ to Bob, Bob sends the corresponding $y_i$ to Alice. They pick the first index $i$ such that $x_i \oplus y_i = 1$ (there will be such an $i$ in their $O(\log\log k)$-set with high probability). Then Alice sends $v_i$ to Bob as a $\log m$-qubit state. Bob applies $U_i$ and sends back the result $U_i v_i$, which is another $\log m$ qubits. Alice measures with subspace $S_i$ versus $S_i^{\perp}$ and outputs the result (0 or 1). The overall communication is $2\log\log k + 2\log m = O(\log\log n)$.

^1 Using shared randomness gives us the result $R^{\parallel,\mathrm{pub}}(f) = 2^{O(Q^{\mathrm{ent}}(f))}$ of [Shi05, Theorem 1.2].

Note that we need both shared randomness and multi-round quantum communication to achieve $Q^{\mathrm{pub}}(f) = O(\log\log n)$, and hence to achieve $Q^{\parallel}(f) = (\log n)^{O(1)}$ via our simulation. In contrast, Yao's simulation from [Yao03] cannot give us an efficient $Q^{\parallel}$-protocol. This is because every classical many-round protocol (including SMP shared-randomness ones) for even one instance of Raz's problem needs about $\sqrt{m} \approx \sqrt{\log n}$ bits of communication [Raz99]. The same lower bound then also holds for the classical SMP model with shared randomness. Hence the best $Q^{\parallel}$-protocol that Yao's simulation could give is $2^{\sqrt{m}} \log n \approx 2^{\sqrt{\log n}}$.

3 Characterization of Quantum Fingerprinting 

As mentioned, all nontrivial and nonclassical quantum SMP protocols known are based on repeated fingerprinting. Here we will analyze the power of protocols that employ this technique, and show that it is closely related to a well studied notion from computational learning theory. This addresses the 4th open problem Yao states in [Yao03]. In particular, we will show that such quantum fingerprinting protocols cannot efficiently compute many Boolean functions for which there is an efficient classical SMP protocol.

3.1 Embeddings and realizations 

We now define two geometrical concepts. 

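Definition 1 Let $f : \mathcal{D} \to \{0,1\}$, with $\mathcal{D} \subseteq X \times Y$, be a (possibly partial) Boolean function. Consider an assignment of unit vectors $\alpha_x \in \mathbb{R}^d$, $\beta_y \in \mathbb{R}^d$ to all $x \in X$ and $y \in Y$.

This assignment is called a $(d, \delta_0, \delta_1)$-threshold embedding of $f$ if $|\langle\alpha_x|\beta_y\rangle|^2 \le \delta_0$ for all $(x,y) \in f^{-1}(0)$ and $|\langle\alpha_x|\beta_y\rangle|^2 \ge \delta_1$ for all $(x,y) \in f^{-1}(1)$.

The assignment is called a $d$-dimensional realization of $f$ with margin $\gamma > 0$ if $\langle\alpha_x|\beta_y\rangle \ge \gamma$ for all $(x,y) \in f^{-1}(0)$ and $\langle\alpha_x|\beta_y\rangle \le -\gamma$ for all $(x,y) \in f^{-1}(1)$.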

Our notion of a "threshold embedding" is essentially Yao's [Yao03, Section 6, question 4], except that we square the inner product instead of taking its absolute value, since it is the square that appears in the swap test's probability. Clearly, threshold embeddings and repeated fingerprinting protocols are essentially the same thing (with fingerprints of $\log d$ qubits, and $O(1/(\delta_1 - \delta_0)^2)$ repetitions). The notion of a "realization" is computational learning theory's notion of the realization of a concept class by an arrangement of homogeneous halfspaces.

These two notions are essentially equivalent: 

Lemma 3 If there is a $(d, \delta_0, \delta_1)$-threshold embedding of $f$, then there is a $(d^2 + 1)$-dimensional realization of $f$ with margin $\gamma = (\delta_1 - \delta_0)/(2 + \delta_1 + \delta_0)$.

Conversely, if there is a $d$-dimensional realization of $f$ with margin $\gamma$, then there is a $(d + 1, \delta_0, \delta_1)$-threshold embedding of $f$ with $\delta_0 = (1 - \gamma)^2/4$ and $\delta_1 = (1 + \gamma)^2/4$.

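Proof. Let $\alpha_x, \beta_y$ be the vectors in a $(d, \delta_0, \delta_1)$-threshold embedding of $f$. For $a = (\delta_1 + \delta_0)/(2 + \delta_1 + \delta_0)$, define new vectors $\alpha'_x = (\sqrt{a}, \sqrt{1 - a} \cdot \alpha_x \otimes \alpha_x)$ and $\beta'_y = (\sqrt{a}, -\sqrt{1 - a} \cdot \beta_y \otimes \beta_y)$. These are unit vectors of dimension $d^2 + 1$. Now

$$\langle\alpha'_x|\beta'_y\rangle = a - (1 - a)|\langle\alpha_x|\beta_y\rangle|^2.$$

If $(x,y) \in f^{-1}(1)$, then $|\langle\alpha_x|\beta_y\rangle|^2 \ge \delta_1$ and hence $\langle\alpha'_x|\beta'_y\rangle \le a - (1 - a)\delta_1 = -\gamma$. Similarly, $\langle\alpha'_x|\beta'_y\rangle \ge \gamma$ for $(x,y) \in f^{-1}(0)$.

For the converse, let $\alpha_x, \beta_y$ be the vectors in a $d$-dimensional realization of $f$ with margin $\gamma$. Define new $(d + 1)$-dimensional unit vectors $\alpha'_x = (1, \alpha_x)/\sqrt{2}$ and $\beta'_y = (1, -\beta_y)/\sqrt{2}$. Now

$$|\langle\alpha'_x|\beta'_y\rangle|^2 = \frac{1}{4}(1 - \langle\alpha_x|\beta_y\rangle)^2.$$

If $(x,y) \in f^{-1}(1)$, then $\langle\alpha_x|\beta_y\rangle \le -\gamma$ and hence $|\langle\alpha'_x|\beta'_y\rangle|^2 \ge \frac{1}{4}(1 + \gamma)^2 = \delta_1$. A similar argument shows $|\langle\alpha'_x|\beta'_y\rangle|^2 \le \frac{1}{4}(1 - \gamma)^2 = \delta_0$ for $(x,y) \in f^{-1}(0)$. □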
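Both conversions in the proof are short enough to check numerically. The sketch below assumes real vectors and uses hypothetical function names; the sample vectors are chosen so that the squared overlap sits exactly at a threshold, making the resulting margin easy to verify by hand.

```python
import math


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def embedding_to_realization(alpha, beta, delta0, delta1):
    """Map a threshold-embedding pair to a realization pair (dimension d^2 + 1)."""
    a = (delta1 + delta0) / (2 + delta1 + delta0)
    tensor_alpha = [p * q for p in alpha for q in alpha]
    tensor_beta = [p * q for p in beta for q in beta]
    new_alpha = [math.sqrt(a)] + [math.sqrt(1 - a) * t for t in tensor_alpha]
    new_beta = [math.sqrt(a)] + [-math.sqrt(1 - a) * t for t in tensor_beta]
    return new_alpha, new_beta


def realization_to_embedding(alpha, beta):
    """Map a realization pair to a threshold-embedding pair (dimension d + 1)."""
    s = math.sqrt(2)
    return [1 / s] + [x / s for x in alpha], [1 / s] + [-x / s for x in beta]


if __name__ == "__main__":
    d0, d1 = 0.25, 0.75
    gamma = (d1 - d0) / (2 + d1 + d0)
    # Squared overlap exactly d0, so the new inner product should equal +gamma.
    na, nb = embedding_to_realization([1.0, 0.0], [0.5, math.sqrt(0.75)], d0, d1)
    print(gamma, inner(na, nb))
    # Inner product -0.6 maps to squared overlap (1 + 0.6)^2 / 4.
    ea, eb = realization_to_embedding([1.0, 0.0], [-0.6, 0.8])
    print(inner(ea, eb) ** 2)
```

The first direction squares the dimension while the second adds a single coordinate, which matches the $(d^2 + 1)$ and $(d + 1)$ in the lemma.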

The tradeoffs between dimension $d$ and margin $\gamma$ have been well studied [For01, FKL+01, FSSS03, LMSS05]. In particular, we can invoke a very strong bound on the best achievable margin of realizations due to very recent work by Linial et al. [LMSS05, Section 3.2] (our $\gamma$ is their $1/mc(M)$).

Theorem 5 (Linial et al.) For $f : X \times Y \to \{0,1\}$, define the $|X| \times |Y|$-matrix $M$ by $M_{xy} = (-1)^{f(x,y)}$. Every realization of $f$ (irrespective of its dimension) has margin $\gamma$ at most

$$K_G \cdot \frac{\|M\|_{\infty\to 1}}{|X| \cdot |Y|},$$

where the norm $\|M\|_{\infty\to 1}$ is given by $\|M\|_{\infty\to 1} = \sup_{v : \|v\|_\infty \le 1} \|Mv\|_1$ and $1 < K_G < 1.8$ is Grothendieck's constant.

This bound is the strongest known upper bound for the margin of a sign matrix. It strengthens the previously known bound due to Forster [For01]:

Corollary 1 (Forster) Every realization of $f$ (irrespective of its dimension) has margin $\gamma$ at most $\gamma \le \|M\|/\sqrt{|X| \cdot |Y|}$, where $\|M\|$ is the operator norm (largest singular value) of $M$. In particular, if $f : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$ is the inner product function, then $\|M\| = \sqrt{2^n}$ and hence $\gamma \le 1/\sqrt{2^n}$.

Combining this with Lemma 3 we see that a $(d, \delta_0, \delta_1)$-threshold embedding of the inner product function has $\delta_1 - \delta_0 = O(1/\sqrt{2^n})$. In repeated fingerprinting protocols, we then need $r \approx 2^n$ different swap tests to enable the referee to reliably distinguish 0-inputs from 1-inputs! Hence if we consider the function $f(x,y)$ defined by the inner product function on the first $\log n$ bits of $x$ and $y$, there is an efficient classical SMP protocol for $f$ (Alice and Bob each send their first $\log n$ bits), but even the best quantum fingerprinting protocol needs to send $\Omega(n)$ qubits. The same actually holds for almost all functions defined on the first $\log n$ bits. This indicates an essential weakness of quantum fingerprinting protocols.
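For small $n$ the operator norm claim for the inner product function can be checked directly: the sign matrix $M$ satisfies $MM^T = 2^n I$, so every singular value equals $\sqrt{2^n}$. The sketch below uses hypothetical helper names and integer encodings of bit strings, and evaluates Forster's bound from that norm.

```python
import math


def inner_product_sign_matrix(n):
    """M[x][y] = (-1)^(<x,y> mod 2) for n-bit strings encoded as integers."""
    size = 1 << n
    return [[(-1) ** bin(x & y).count("1") for y in range(size)]
            for x in range(size)]


def gram(matrix):
    """Return M M^T as a list of rows."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in matrix]
            for row_i in matrix]


def forster_margin_bound(n):
    """||M|| / sqrt(|X| |Y|) with ||M|| = sqrt(2^n) and |X| = |Y| = 2^n."""
    return math.sqrt(2 ** n) / (2 ** n)


if __name__ == "__main__":
    n = 3
    g = gram(inner_product_sign_matrix(n))
    print(all(g[i][j] == (2 ** n if i == j else 0)
              for i in range(2 ** n) for j in range(2 ** n)))
    print(forster_margin_bound(n), 1 / forster_margin_bound(n) ** 2)
```

The last printed value, $1/\gamma^2$ at the bound, is the order of the number of swap tests the text refers to.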

In general, the preceding arguments show that we cannot have an efficient repeated fingerprinting protocol if $f$ cannot be realized with large margin. If the largest achievable margin is $\gamma$, the protocol will need $\Omega(1/\gamma^2)$ copies of $|\alpha_x\rangle$ and $|\beta_y\rangle$. We now show that this lower bound is close to optimal. Consider a realization of $f : X \times Y \to \{0,1\}$ with maximal margin $\gamma$. Its vectors may have very high dimension, but nearly the same margin can be achieved in fairly low dimension if we use the Johnson-Lindenstrauss lemma [JL84]. Assume without loss of generality that $|X| \ge |Y|$ and let $n = \log |X|$.

Lemma 4 A $D$-dimensional realization of $f$ with margin $\gamma$ can be converted into an $O(n/\gamma^2)$-dimensional realization of $f$ with margin $\gamma/2$.

Using Lemma 3 this gives us a $(d, \delta_0, \delta_1)$-threshold embedding of $f$ with $d = O(n/\gamma^2)$, $\delta_0 = (1 - \gamma/2)^2/4$ and $\delta_1 = (1 + \gamma/2)^2/4$. Note that $\delta_1 - \delta_0 = \gamma/2$. This translates directly into a repeated fingerprinting protocol with states $|\alpha_x\rangle$ and $|\beta_y\rangle$ of $d$ dimensions, hence $O(\log(n/\gamma^2))$ qubits, and $r = O(1/\gamma^2)$. For example, if $f$ is equality then $\gamma$ is constant, which implies an $O(\log n)$-qubit repeated fingerprinting protocol for equality (of course, we already had one with $r = 1$). In sum:

Theorem 6 For $f : X \times Y \to \{0,1\}$ with $2^n = |X| \ge |Y|$, define the $|X| \times |Y|$-matrix $M$ by $M_{xy} = (-1)^{f(x,y)}$, and let $\gamma$ denote the largest margin among all realizations of $M$. There exists a repeated fingerprinting protocol for $f$ that uses $r = O(1/\gamma^2)$ copies of $O(\log(n/\gamma^2))$-qubit states. Conversely, every repeated fingerprinting protocol for $f$ needs $\Omega(1/\gamma^2)$ copies of its $|\alpha_x\rangle$ and $|\beta_y\rangle$ states.



3.2 Application: getting margin lower bounds from communication protocols 

The connection between repeated fingerprinting and maximum margin of a realization can be exploited in the reverse direction as well, by deriving new lower bounds on margin complexity from known communication protocols. Yao [Yao03] considered the following Hamming distance problem on $n$-bit strings $x$ and $y$:

$\mathrm{HAM}_n^{(d)}(x,y) = 1$ iff the Hamming distance between $x$ and $y$ is $\Delta(x,y) \le d$.

For $d = 0$, this is just the equality problem. Yao showed $R^{\parallel,\mathrm{pub}}(\mathrm{HAM}_n^{(d)}) = O(d^2)$ (actually, a better classical protocol may be derived from the earlier paper [FIM+01]). We can derive a threshold embedding directly from Yao's classical construction in [Yao03, Section 4]. There, the length of the messages sent by the parties is $m = \Theta(d^2)$. The referee accepts only if the Hamming distance between the messages is below a certain threshold $t = \Theta(m)$. Let $a_{rx}$ be Alice's message on random string $r$ and input $x$, $a_{rxi}$ be the $i$-th bit of this message, and similarly for Bob. Again we may assume $r$ ranges over a set of size $n' = O(n)$ [New91]. Yao shows that for uniformly random $r$ and $i$, $\Pr[a_{rxi} \ne b_{ryi}] \le t/m - \Theta(1/d)$ if $\Delta(x,y) \le d$, and $\Pr[a_{rxi} \ne b_{ryi}] \ge t/m + \Theta(1/d)$ if $\Delta(x,y) > d$. Here $t/m = \Theta(1)$. Now define the following $(\log(n') + 2\log(d) + 1)$-qubit states:

$$|\alpha_x\rangle = \frac{1}{\sqrt{n'm}} \sum_r |r\rangle \sum_{1 \le i \le m} |i\rangle|a_{rxi}\rangle \quad\text{and}\quad |\beta_y\rangle = \frac{1}{\sqrt{n'm}} \sum_r |r\rangle \sum_{1 \le i \le m} |i\rangle|b_{ryi}\rangle.$$

Then

$$\langle\alpha_x|\beta_y\rangle = \frac{1}{n'm} \sum_r \sum_{1 \le i \le m} \langle a_{rxi}|b_{ryi}\rangle = \Pr_{r,i}[a_{rxi} = b_{ryi}].$$



This is a threshold embedding of $\mathrm{HAM}_n^{(d)}$ with $\delta_1 - \delta_0 = \Theta(1/d)$, so the margin complexity of this problem is $\gamma(\mathrm{HAM}_n^{(d)}) = \Omega(1/d)$. We have not found this result anywhere else in the literature on maximum margin realizations and believe it is novel.
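The way message bits turn into fingerprint overlaps can be seen on a toy example. The sketch below is not Yao's protocol: the message tables are made-up inputs and the function names are hypothetical. It only shows that states of the form $\frac{1}{\sqrt{n'm}} \sum_r |r\rangle \sum_i |i\rangle|\text{bit}\rangle$ have inner product equal to the fraction of positions $(r, i)$ on which the two parties' message bits agree.

```python
import math


def message_state(messages):
    """Flat real vector for (1/sqrt(n'm)) sum_r |r> sum_i |i>|messages[r][i]>."""
    n_prime, m = len(messages), len(messages[0])
    amp = 1.0 / math.sqrt(n_prime * m)
    state = []
    for row in messages:
        for bit in row:
            # Two amplitudes per (r, i): one for bit value 0, one for bit value 1.
            state.extend([amp, 0.0] if bit == 0 else [0.0, amp])
    return state


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def agreement_fraction(msgs_a, msgs_b):
    """Fraction of positions (r, i) where the two message bits coincide."""
    pairs = [(a, b) for row_a, row_b in zip(msgs_a, msgs_b)
             for a, b in zip(row_a, row_b)]
    return sum(1 for a, b in pairs if a == b) / len(pairs)


if __name__ == "__main__":
    alice = [[0, 1, 1], [1, 0, 0]]   # toy messages for two random strings r
    bob = [[0, 1, 0], [1, 1, 0]]
    print(inner(message_state(alice), message_state(bob)),
          agreement_fraction(alice, bob))
```

A gap of order $1/d$ in the agreement probability therefore carries over directly to a gap of the same order in the squared overlaps.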
3.3 Application: a margin-based lower bound on $Q^{\mathrm{ent}}(f)$

Let us consider again the unit vectors (a.k.a. quantum states) $\alpha_x$ and $\beta_y$ constructed in Section 2.2 from a quantum protocol for function $f$ with $q = Q^{\mathrm{ent}}(f)$ qubits of communication. These states form a $(d, \delta_0, \delta_1)$-threshold embedding of $f$ with $\delta_1 - \delta_0 = \Theta(2^{-4q})$. By Lemma 3 this in turn implies that the maximal achievable margin among all realizations of $f$ is $\gamma(f) = \Omega(2^{-4q})$, which translates into a lower bound on quantum communication complexity in terms of margins:

Theorem 7 $Q^{\mathrm{ent}}(f) \ge \frac{1}{4} \log(1/\gamma(f)) - O(1)$.

Since almost all $f$ have exponentially small maximal margin [LMSS05, Section 5], it follows that almost all $f$ have linear communication complexity even for multi-round protocols with unlimited prior entanglement. As far as we know, this is a new result (albeit not a very surprising one).

The last theorem has been independently obtained by Linial and Shraibman [LS06]. Even more interestingly, they actually showed a linear relation between margin complexity $1/\gamma(f)$ and discrepancy. Hence they extend the discrepancy lower bound to $Q^{\mathrm{ent}}(f)$. It was already known to hold for $Q(f)$ without entanglement [Kre95].

4 Discussion 

Our simulation is relevant for the longstanding open question regarding the power of quantum entanglement in communication complexity: how much can we reduce communication complexity by giving the parties access to unlimited amounts of EPR-pairs? No good upper bounds are known on the largest amount of entanglement (shared EPR-pairs) that is "still useful". This is in contrast to the situation with shared randomness, where Newman's theorem shows that in the standard one-round or multi-round setting, $O(\log n)$ shared coin flips suffice [New91], and hence shared randomness can save at most $O(\log n)$ communication.^2 Like Shi's result [Shi05], our result does not give an upper bound on the amount of prior entanglement that is needed, but it does imply that adding large amounts of prior entanglement can reduce the communication no more than exponentially.

An interesting direction is to tap into the vast literature on maximal-margin classification and support vector machines (SVMs) to find more natural communication problems having efficient quantum fingerprinting protocols. Currently, the only natural and nontrivial example we have of this is the equality problem from [BCWW01] and its variations in Section 3. Every learning problem involving a concept class $C$ over the set of $n$-bit strings corresponds to a $|C| \times 2^n$ communication complexity problem. If the learning problem can be embedded with large margin ($\gamma \ge 1/(\log n)^{O(1)}$, say), the communication problem has an efficient quantum fingerprinting protocol.

A fascinating line of research which combines our main results is the following. Our Theorem 4 together with the characterization of repeated fingerprinting in Theorem 6 opens the possibility to derive new lower bounds on the maximum margin of a sign matrix. It is sufficient to give an efficient multi-round quantum communication protocol (even with unlimited pre-shared entanglement) for a Boolean function to show that the corresponding concept class can be learned efficiently, yet another interesting possibility of proving classical results the quantum way. Conversely, strong upper bounds on maximum margin, like the one of Linial et al. in Theorem 5, give lower bounds on the communication complexity in the multi-round quantum communication model with unlimited shared entanglement.

^2 In fact, Jain et al. [JRS05] show that Newman's blackbox-type proof, which keeps the protocol the same and just reduces the set of random strings to $O(n)$ elements, cannot be lifted to the quantum setting to get a significant reduction in the amount of entanglement used.